We report the results of intensive numerical calculations for the four-atomic H2+H2 energy-transfer collision. A
parallel computing technique based on LAM/MPI functions is used. In this algorithm, the data are distributed to the
processors according to the value of the total angular momentum quantum number J and its projection M. Most of the
work is local to each processor. The topology of the data communication is a simple star. Timings are given and the
scaling of the algorithm is discussed. Two different recently published potential energy surfaces for the H2-H2 system
are applied. New results for the state-resolved excitation-deexcitation cross sections and rates, valuable for
astrophysical applications, are presented. Finally, more sophisticated extensions of the parallel code are discussed.

Keywords: Parallel algorithm, LAM/MPI application, Star-type cluster, quantum dynamics.

I. Introduction

In modern competitive research in science and technology, high performance computing plays a paramount role. Its
importance derives from the fact that correctly chosen and designed numerical methods and algorithms, properly
adapted to parallel and multithreaded techniques, can substantially reduce computation time and active memory usage
[1]. This is especially important in calculations of quantum molecular dynamics and atomic collisions because of
their great complexity.

Generally speaking, modern computational research in scientific applications has two goals: first, to provide
efficient and stable numerical calculations, and second, to make proper use of various high performance
techniques such as LAM/MPI, OpenMP and others [2]. It is now important not only to get correct
numerical results, but also to design and implement efficient high performance algorithms that deliver results
faster and with less memory. We would like to note here that a program designed for specific problems
in computational physics, chemistry or biology should be able to perform calculations in either serial or parallel
mode. The problem we selected for our parallel computation in this work is taken from molecular/chemical physics.

Specifically, we carry out detailed quantum-mechanical calculations of state-resolved cross sections and rates
in hydrogen molecular collisions H2+H2. Interactions and collisions between hydrogen molecules and hydrogen
molecular isotopes, for example H2+HD, have been of great theoretical and experimental interest for many years [3-14].
We explore the quantum-mechanical 4-atomic system shown in Fig. 1 using six independent
variables, which give a full description of the system. The main goal of this investigation is to carry out a
comparative analysis of two recently published potential energy surfaces (PESs) for H2-H2.

Our motivation for selecting this problem is that the hydrogen molecule plays an important role in many areas of
astrophysics [15-16]. It is the simplest and most abundant molecule in the universe, especially in giant molecular
clouds. Because of the low number of electrons in H2-H2, this is one of the few four-center systems for which the
potential energy surface (PES) can be developed with very high precision. Therefore H2+H2 may also serve as a benchmark
collision for testing other dynamical methods. Additionally, H2+H2 elastic and inelastic collisions are of interest
in combustion and spacecraft modeling, and hydrogen gas is at present becoming a very important potential energy
supplier, see for example [17].

We test two PESs: the first is a global 6-dimensional potential from work [18], and the second is a very
accurate interaction potential calculated from first principles [19]. Because we carry out detailed
quantum-mechanical calculations using two PESs, the computational work is at least doubled and therefore even
more time consuming. We needed to carry out convergence tests with respect to different chemical and numerical
parameters for both PESs and, finally, to make production calculations for many values of the collision kinetic
energy. Clearly, an application of parallel computing techniques is very useful in this situation.

Fig. 1. 4-body coordinates for the H2-H2 system used in this work.

In this work we carry out parallel computations with up to 14 processors. The scattering cross sections and their
corresponding rate coefficients are calculated using a non-reactive quantum-mechanical close-coupling approach.
In the next section we briefly outline the quantum-mechanical method and the parallelization approach. Our
calculations for H2+H2, together with scaling and timing results, are presented in Sec. III. Conclusions are given
in Sec. IV. Atomic units ($e = m_e = \hbar = 1$) are used throughout the work.



A. Quantum-mechanical approach 

In this section we briefly present the quantum-mechanical approach and the parallel algorithm used in this work.
The 4-atomic H2-H2 system is shown in Fig. 1. It can be described by six independent variables: $r_1$ and $r_2$
are the interatomic distances in each hydrogen molecule, $\theta_1$ and $\theta_2$ are polar angles, $\phi_{12}$ is
the torsional angle and $R$ is the intermolecular distance. The hydrogen molecules are treated as linear rigid
rotors, that is, the distances $r_1 = r_2 = 0.74$ Å are fixed in this model. We provide a numerical solution of the
Schrödinger equation for an ab + cd collision in the center-of-mass frame, where ab and cd are linear rigid rotors.

Fig. 2. Schematic diagram of the topology of interprocessor communication: an example of a star-type cluster with
one main and four satellite computers.

To solve the equation, the total 4-atomic H2+H2 wave function is expanded into channel angular momentum
functions $\phi^{JM}_{\alpha}(\hat r_1, \hat r_2, \hat R)$ [4]. This procedure, followed by separation of angular
momentum, provides a set of coupled second-order differential equations for the unknown radial functions
$U^{JM}_{\alpha}(R)$, where $\alpha \equiv (j_1 j_2 j_{12} L)$, $\vec j_1 + \vec j_2 = \vec j_{12}$,
$\vec j_{12} + \vec L = \vec J$, and $j_1$, $j_2$, $L$ are the quantum angular momenta corresponding to the vectors
$\vec r_1$, $\vec r_2$ and $\vec R$ respectively, $M_{12} = (m_a + m_b)(m_c + m_d)/(m_a + m_b + m_c + m_d)$,
$V(\vec r_1, \vec r_2, \vec R)$ is the potential energy surface for the 4-atomic system abcd, and $k_\alpha$ is the
channel wavenumber.
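
For orientation, the coupled radial equations described here can be written in the usual close-coupling form for two rigid rotors. The expression below is a sketch assembled from the symbols defined in this section and from the general formalism of [4]; it is an assumed standard form, and the angle brackets denote integration over the angular variables $\hat r_1$, $\hat r_2$ and $\hat R$.

```latex
\left( \frac{d^2}{dR^2} - \frac{L(L+1)}{R^2} + k_\alpha^2 \right) U^{JM}_{\alpha}(R)
  = 2 M_{12} \sum_{\alpha'}
    \left\langle \phi^{JM}_{\alpha}(\hat r_1, \hat r_2, \hat R) \right|
    V(\vec r_1, \vec r_2, \vec R)
    \left| \phi^{JM}_{\alpha'}(\hat r_1, \hat r_2, \hat R) \right\rangle
    U^{JM}_{\alpha'}(R)
```

Because the left-hand side is diagonal in $\alpha$ and the potential matrix is diagonal in $J$ and $M$, each $J$/$M$ block can be propagated without reference to the others, which is the property the parallelization relies on.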

We apply the hybrid modified log-derivative-Airy propagator in the general-purpose scattering code MOLSCAT
[20] to solve the coupled radial equations (1). We have also tested other propagator schemes included in the code.
Our calculations showed that the other propagators are also quite stable for both H2-H2 potentials considered in
this work.

Since all experimentally observable quantum information about the collision is contained in the asymptotic
behaviour of the functions $U^{JM}_{\alpha}(R \to \infty)$, the log-derivative matrix is propagated to large
intermolecular distances $R$. The numerical results are matched to the known asymptotic solution to derive the
physical scattering S-matrix [4]. The method was used for each partial wave until a converged cross section was
obtained. It was verified that the results are converged with respect to the number of partial waves as well as the
matching radius $R_{max}$ for all channels included in the calculations. Cross sections for rotational excitation
and relaxation can be obtained directly from the S-matrix. In particular, the cross sections for excitation from
$j_1 j_2 \to j'_1 j'_2$, summed over the final $m'_1 m'_2$ and averaged over the initial $m_1 m_2$, are given by

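$$\sigma(j'_1, j'_2; j_1 j_2, \epsilon) = \frac{\pi}{(2j_1 + 1)(2j_2 + 1)\, k_{\alpha\alpha'}} \sum_{J j_{12} j'_{12} L L'} (2J + 1) \left| \delta_{\alpha\alpha'} - S^J(j'_1, j'_2, j'_{12}, L'; j_1, j_2, j_{12}, L; E) \right|^2 . \tag{2}$$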

The kinetic energy is $\epsilon = E - B_1 j_1(j_1 + 1) - B_2 j_2(j_2 + 1)$, where $B_{1(2)}$ are the rotational
constants of the rigid rotors ab and cd respectively.

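The relationship between a rate coefficient $k_{j_1 j_2 \to j'_1 j'_2}(T)$ and the corresponding cross section
$\sigma_{j_1 j_2 \to j'_1 j'_2}(E_{kin})$ can be obtained through the following weighted average

$$k_{j_1 j_2 \to j'_1 j'_2}(T) = \sqrt{\frac{8 k_B T}{\pi \mu}}\, \frac{1}{(k_B T)^2} \int_{\epsilon_s}^{\infty} \sigma_{j_1 j_2 \to j'_1 j'_2}(E_{kin})\, e^{-E_{kin}/k_B T}\, E_{kin}\, dE_{kin}, \tag{3}$$

where $T$ is the temperature, $k_B$ is the Boltzmann constant, $\mu$ is the reduced mass of the molecule-molecule
system, and $\epsilon_s$ is the minimum kinetic energy for the levels $j_1$ and $j_2$ to become accessible.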
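
The thermal average can be evaluated numerically once the cross section is tabulated on an energy grid. The following is a minimal sketch, assuming the standard Maxwell-Boltzmann form $k(T) = \sqrt{8 k_B T/(\pi\mu)}\,(k_B T)^{-2} \int \sigma(E)\, E\, e^{-E/k_B T}\, dE$ in atomic units and a simple trapezoidal rule. The function names, the grid and the constant test cross section are illustrative assumptions, not part of the production code.

```python
import math

K_B = 3.166811563e-6  # Boltzmann constant in hartree per kelvin (atomic units)


def rate_coefficient(energies, sigmas, temperature, mu):
    """Thermal rate k(T) from tabulated cross sections sigma(E_kin).

    energies    -- ascending kinetic-energy grid (hartree), starting at the threshold
    sigmas      -- cross sections on that grid (bohr^2)
    temperature -- T in kelvin
    mu          -- reduced mass of the collision pair (electron masses)
    """
    kt = K_B * temperature
    integrand = [s * e * math.exp(-e / kt) for e, s in zip(energies, sigmas)]
    # Trapezoidal rule; works for non-uniform grids as well.
    integral = sum(
        0.5 * (integrand[i] + integrand[i - 1]) * (energies[i] - energies[i - 1])
        for i in range(1, len(energies))
    )
    return math.sqrt(8.0 * kt / (math.pi * mu)) * integral / kt**2


def mean_speed(temperature, mu):
    """Mean relative thermal speed sqrt(8 k_B T / (pi mu))."""
    return math.sqrt(8.0 * K_B * temperature / (math.pi * mu))


# Sanity check with a made-up constant cross section: the average must then
# reduce to sigma0 times the mean relative speed.
T, MU, SIGMA0 = 100.0, 1837.0, 1.0  # MU: rough H2+H2 reduced mass (assumption)
grid = [i * 0.01 * K_B * T for i in range(4001)]  # 0 .. 40 k_B T
k = rate_coefficient(grid, [SIGMA0] * len(grid), T, MU)
print(k / (SIGMA0 * mean_speed(T, MU)))  # should be very close to 1
```

The constant-cross-section check is useful in practice because it tests the energy grid: if the ratio drifts away from 1, the grid is too coarse or does not extend far enough beyond $k_B T$.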
Fig. 3. J/M-parallelization method, see text.



B. Parallelization 

To support parallel computation in this work the following machines are used: Sun Netra X1 (UltraAX-i2) computers
with 128 MB RAM (512 MB swap) and a 500 MHz UltraSPARC-IIe processor. The master computer is a Sun Fire V440
with 8 GB RAM and four 1.062 GHz UltraSPARC-IIIi processors. The system is shown schematically in Fig. 2. We apply
LAM/MPI to provide the parallel environment in the cluster.

An important feature of the parallel algorithm used in this work is that calculations for specific values of J and M
are essentially independent. In the PMP MOLSCAT program [21] used here, the parallelization is done over
the loop on the values of J and M. The code distributes the required J/M pairs across the available processors. The
distribution of the computational work is shown schematically in Fig. 3. The same idea has been used in works [22],
[23] for semiquantal atomic collisions. In those works the parallelization was done over the impact parameter
$\rho$ of the colliding particles, because the solutions of the resulting dynamical equations for different values
of $\rho$ are independent of one another. It is well known that in the semiclassical approach the impact parameter
$\rho$ is an analog of the quantum number J.

As mentioned above, the quantum-mechanical approach used in this work applies a partial wave expansion.
A set of coupled-channel differential equations has to be solved for many values of the total angular momentum J.
To calculate the state-resolved cross sections and then the rate coefficients (3), the resulting S-matrix elements
have to be summed over the different values of J. Calculations for a single J can be broken into two or more sectors
corresponding to different values of M, which is a projection of J.

There are two methods to distribute the work among the satellite computers. In the static method, at the beginning
of the job each computer makes a list of all the J/M tasks to be solved. Then each computer selects a subset
of the tasks to carry out. Obviously each computer has to get a different subset, and an approach is needed
that gives an approximately equal amount of work to each computer. There is no interprocessor communication
in this method.
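
The static method can be sketched as follows. Every process builds the same ordered task list and picks its share from its own rank, so no messages are exchanged. This is a minimal illustration only: the cost model, the number of M blocks per J and all function names are hypothetical and do not reproduce the PMP MOLSCAT implementation.

```python
def jm_tasks(j_max, n_m):
    """All (J, M) task labels; n_m blocks per J is a placeholder count."""
    return [(j, m) for j in range(j_max + 1) for m in range(n_m)]


def estimated_cost(task):
    """Hypothetical cost model: work grows with J and then saturates."""
    j, _m = task
    return (min(j, 20) + 1) ** 3


def static_subset(tasks, rank, n_workers, cost=estimated_cost):
    """Tasks owned by process `rank`.

    Each process sorts the identical list by decreasing cost and takes every
    n_workers-th entry starting at its rank, so the subsets are disjoint and
    the expensive tasks are spread evenly.
    """
    ordered = sorted(tasks, key=cost, reverse=True)
    return ordered[rank::n_workers]


tasks = jm_tasks(j_max=100, n_m=2)
n_workers = 4
loads = [sum(estimated_cost(t) for t in static_subset(tasks, r, n_workers))
         for r in range(n_workers)]
print(loads)  # estimated work per process
```

The balance is only as good as the cost estimate. If the real run times differ from the estimate, some processes finish early and sit idle, which is the weakness the dynamic method addresses.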

In the dynamic approach one computer acts as a dispatcher. It makes a list of all the J/M tasks to
be done, then waits for the computational processes to call in requesting work. Starting with the longest tasks, the
dispatcher hands out J/M tasks to the computing processes until all of them have been assigned. After that, the next
time a computational process asks for work, the dispatcher sends it a termination message, and the computational
process then does its end-of-run cleanup and exits.
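
The behaviour of the dispatcher can be illustrated without MPI by simulating which process calls in next. The sketch below is a serial simulation under assumed task costs, not the actual message-passing code: a heap keeps track of the time at which each computing process becomes free, and the longest remaining task always goes to the first process that asks for work.

```python
import heapq


def simulate_dispatcher(task_costs, n_workers):
    """Simulate longest-first dynamic hand-out of tasks.

    task_costs -- dict mapping a task label, e.g. (J, M), to its run time
    n_workers  -- number of computing processes (the dispatcher is not counted)
    Returns (makespan, assignment), where assignment maps worker id -> task list.
    """
    queue = sorted(task_costs.items(), key=lambda kv: kv[1], reverse=True)
    free_at = [(0.0, w) for w in range(n_workers)]  # (time worker is free, id)
    heapq.heapify(free_at)
    assignment = {w: [] for w in range(n_workers)}
    makespan = 0.0
    for task, cost in queue:
        t, w = heapq.heappop(free_at)      # next process to request work
        assignment[w].append(task)
        finish = t + cost
        heapq.heappush(free_at, (finish, w))
        makespan = max(makespan, finish)
    # At this point every further request would receive the termination message.
    return makespan, assignment


# Made-up run times for five (J, M) tasks, in arbitrary units.
costs = {(0, 0): 1.0, (1, 0): 2.0, (2, 0): 3.0, (3, 0): 4.0, (4, 0): 5.0}
makespan, assignment = simulate_dispatcher(costs, n_workers=2)
print(makespan, assignment)
```

Handing out the longest tasks first keeps the short ones in reserve to fill the gaps at the end of the run, so the processes finish at nearly the same time even when the run times are not known in advance.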

III. Results 

In this section we present our results from the parallel calculations, using MPI functions, of rotational
transitions in collisions between para-/para- and ortho-/ortho-hydrogen molecules,

$$\mathrm{H}_2(j_1) + \mathrm{H}_2(j_2) \to \mathrm{H}_2(j'_1) + \mathrm{H}_2(j'_2), \tag{4}$$

together with scaling results.

As mentioned in the Introduction, we apply the new PESs from works [18] and [19]. The DJ PES
[19] is constructed for the vibrationally averaged rigid monomer model of the H2-H2 system to the complete basis
set limit using coupled-cluster theory with single, double and triple excitations. A four-term spherical harmonics
expansion model was chosen to fit the surface. It was demonstrated that the calculated PES can reproduce the
quadrupole moment to within 0.58% and the experimental well depth to within 1%.

The bond length was fixed at 1.449 a.u. or 0.7668 Å. The DJ PES is defined by the center-of-mass intermolecular
distance $R$ and three angles: $\theta_1$ and $\theta_2$ are the plane angles and $\phi_{12}$ is the relative
torsional angle. The angular increment for each of the three angles defining the relative orientation of the dimers
was chosen to be 30°.

The BMKP PES [18] is a global six-dimensional potential energy surface for two hydrogen molecules. It was
especially constructed to represent the whole interaction region of the chemical reaction dynamics of the four-atomic
system and to provide as accurate a van der Waals well as possible. In the six-dimensional conformation space of
the four-atomic system the conical intersection forms a complicated three-dimensional hypersurface. The authors
of [18] mapped out a large portion of the locus of this conical intersection.

The BMKP PES uses Cartesian coordinates to compute the distances between the four atoms. We have written
Fortran code that converts the spherical coordinates used in Sec. II to the corresponding Cartesian coordinates and
computes the distances between the four atoms. In all our calculations with this potential the bond length was fixed
at 1.449 a.u. or 0.7668 Å, as for the DJ PES.
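
The coordinate conversion can be sketched as follows. The frame convention is an assumption made for this illustration: the center of molecule 1 is at the origin, the center of molecule 2 is at $(0, 0, R)$, molecule 1 lies in the xz-plane and molecule 2 is rotated about the z axis by the torsional angle. For homonuclear molecules the center of mass is the bond midpoint. The function names are hypothetical, and the sketch is written in Python rather than Fortran.

```python
import math
from itertools import combinations


def atom_positions(r1, r2, R, theta1, theta2, phi):
    """Cartesian positions of atoms a, b (molecule 1) and c, d (molecule 2).

    r1, r2         -- bond lengths
    R              -- distance between the molecular centers of mass
    theta1, theta2 -- polar angles of the bond axes relative to the z axis
    phi            -- torsional angle between the two molecular planes
    """
    u1 = (math.sin(theta1), 0.0, math.cos(theta1))
    u2 = (math.sin(theta2) * math.cos(phi),
          math.sin(theta2) * math.sin(phi),
          math.cos(theta2))
    centre2 = (0.0, 0.0, R)
    a = tuple(0.5 * r1 * x for x in u1)
    b = tuple(-0.5 * r1 * x for x in u1)
    c = tuple(s + 0.5 * r2 * x for s, x in zip(centre2, u2))
    d = tuple(s - 0.5 * r2 * x for s, x in zip(centre2, u2))
    return a, b, c, d


def interatomic_distances(positions):
    """The six pair distances in the order ab, ac, ad, bc, bd, cd."""
    return [math.dist(p, q) for p, q in combinations(positions, 2)]


BOND = 1.449  # fixed bond length in a.u., as quoted in the text
collinear = interatomic_distances(atom_positions(BOND, BOND, 6.0, 0.0, 0.0, 0.0))
print(collinear)
```

A useful check on any such routine is that the two bond lengths ab and cd come back unchanged for every choice of angles, since the rigid-rotor model keeps them fixed.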

The main goal of this work is to carry out quantum-mechanical calculations for different transitions in
p-H2+p-H2 and o-H2+o-H2 collisions and to provide a comparative study of the two PESs presented above. The
energy dependence of the elastic integral cross sections $\sigma_{el}(E_{kin})$ is shown in Fig. 4 (upper plots),
together with the state-resolved integral cross sections $\sigma_{j_1 j_2 \to j'_1 j'_2}(E_{kin})$ for the
$j_1 = j_2 = 0 \to j'_1 = 2, j'_2 = 2$ and $j_1 = j_2 = 1 \to j'_1 = 1, j'_2 = 3$ rotational transitions (lower
plots), for both the BMKP and DJ PESs. As can be seen, both PESs provide the same type of behaviour in the cross
section. These results are in basic agreement with recent calculations that used a time-dependent quantum-mechanical
approach [10]. Our calculations show that the DJ PES generates higher values for the cross sections.

A large number of test calculations have also been done to secure the convergence of the results with respect to
all parameters that enter into the propagation of the Schrödinger equation. These include the intermolecular distance
$R$, the total angular momentum $J$ of the four-atomic system, the number of rotational levels included
in the close-coupling expansion, and others (see the MOLSCAT manual [20]).

We reached convergence for the integral cross sections $\sigma(E_{kin})$ in all considered collisions. In the case
of the DJ PES the propagation has been done from 2 Å to 10 Å, since this potential is defined only for those specific
distances. For the BMKP PES we used $r_{min} = 1$ Å to $r_{max} = 30$ Å. We also applied a few different propagators
included in the MOLSCAT program.

Fig. 5. Temperature dependence of the state-resolved thermal rate constant (left panel) and corresponding cross
section (right panel) for the transition $j_1 = j_2 = 0 \to j'_1 = 2, j'_2 = 0$. Results from other works for the
thermal rate $k_{00 \to 20}(T)$ are also included. The results for the DJ PES are given in solid lines. The diamonds
are the theoretical data of this work calculated with the BMKP PES.

A convergence test with respect to the maximum value of the total angular momentum showed that $J_{max} = 100$
is good enough for the range of energies considered in this work. We tested various rotational levels $j_1 j_2$
included in the close-coupling expansion for the numerical propagation of the resulting coupled equations (1). In
these test calculations we used two basis sets: $j_1 j_2$ = 00, 20, 22, 40, 42 with total basis set size
$N_{lvl} = 13$ and $j_1 j_2$ = 00, 20, 22, 40, 42, 44, 60, 62 with $N_{lvl} = 28$. We found [24] that the results are
quite stable for the 00→20 and 00→22 transitions and somewhat stable for the highly excited 00→40 transition.
Nonetheless, for our production calculations we used the first basis set.
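
The quoted basis set sizes, 13 and 28, are consistent with counting the coupled $(j_1 j_2 j_{12})$ combinations, where $j_{12}$ runs from $|j_1 - j_2|$ to $j_1 + j_2$ for each listed pair of rotational levels. The short check below illustrates this counting. It is one reading of how the sizes arise, offered as an assumption rather than as the definition used inside MOLSCAT, and the function name is hypothetical.

```python
def coupled_basis_size(levels):
    """Number of (j1, j2, j12) combinations with |j1 - j2| <= j12 <= j1 + j2."""
    return sum((j1 + j2) - abs(j1 - j2) + 1 for j1, j2 in levels)


small_basis = [(0, 0), (2, 0), (2, 2), (4, 0), (4, 2)]
large_basis = small_basis + [(4, 4), (6, 0), (6, 2)]

print(coupled_basis_size(small_basis), coupled_basis_size(large_basis))  # prints: 13 28
```

The number of coupled channels for a given $J$ is larger still, because each of these combinations is further coupled with the allowed values of $L$; that growth is what makes the individual J/M tasks expensive.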

It is important to point out here that for comparison purposes we do not include the compensating factor of 2
mentioned in [5]. However, in Fig. 4 (upper plots) and in our subsequent calculations of the thermal rate
coefficients $k_{jj'}(T)$ the factor is included.

The differences between the cross sections of the two potentials are reflected in the state-resolved transition
$j_1 = 0, j_2 = 0 \to j'_1 = 0, j'_2 = 2$, as shown in Fig. 5 (right panel). It seems that the DJ PES can provide much
better results, as seen in the left panel of the same figure, where we present the corresponding
thermal rates $k_{00 \to 20}(T)$ calculated with the DJ potential together with the results of other theoretical
calculations. The agreement is perfect. Thus, one can conclude that the DJ PES is better suited for the H2-H2 system.
In Fig. 6 we provide thermal rates for different transitions calculated with the DJ PES only, in comparison with other
theoretical data obtained with different dynamical methods and PESs. Again the agreement is very good.

In Fig. 7 we present an example of our timing results using the dynamic method for a specific H2($j_1$)+H2($j_2$)
calculation. It can be seen that including additional processors reduces the computation time. Two results are
shown. The left plot shows the dependence of the computing time on the number of active parallel processors. The
right plot illustrates the degree of speed-up of the calculations. The speed-up for a fixed test calculation is
defined as $t_1/t_{n_p}$, where $t_1$ is the computation time with only one processor and $t_{n_p}$ is the time with
$n_p$ processors.
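
The speed-up defined here, and the related parallel efficiency (speed-up divided by the number of processors), can be computed from measured wall-clock times as in the sketch below. The timings in the example are invented placeholders chosen for easy arithmetic, not the measurements behind Fig. 7, and the function name is hypothetical.

```python
def speed_up_table(timings):
    """Speed-up t_1 / t_np and efficiency t_1 / (n_p * t_np) for each n_p.

    timings -- dict mapping the number of processors n_p to the run time;
               it must contain the single-processor time under key 1.
    """
    t1 = timings[1]
    return {n: (t1 / t, t1 / (n * t)) for n, t in sorted(timings.items())}


# Placeholder timings in arbitrary units (not measured data).
example = {1: 100.0, 2: 50.0, 4: 40.0}
for n_p, (s, eff) in speed_up_table(example).items():
    print(n_p, s, eff)
```

An efficiency well below 1 at large $n_p$ indicates that the run time is dominated by the longest single J/M task or by idle processes near the end of the job, rather than by the amount of work per processor.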

IV. Conclusion 

We carried out parallel computations of state-resolved rotational excitation and deexcitation cross sections and
rates in molecular para-/para- and ortho-/ortho-H2 collisions of astrophysical interest. The LAM/MPI technique
allowed us to speed up the computation by a factor of about 4.5 within our 14-processor Sun Unix cluster.
We tested the two newest potential energy surfaces for the considered systems. The application of the parallel
algorithm thus reduced the computation time needed to test the two potentials. Convergence tests and results for
cross sections and rate coefficients using two different potential energy surfaces for the H2-H2 system have been
obtained for a wide range of kinetic energies.

Fig. 7. Computation time and speed-up $t_1/t_{n_p}$ depending on the number of parallel processors $n_p$ using the
dynamic approach, see text.

We would like to point out here that the hydrogen problem is very important for many reasons. The main
motivation has been described in the Introduction. It is also necessary to stress that the hydrogen-hydrogen
collision may be particularly interesting in nanotechnology applications, when the system is confined
inside a single-wall carbon nanotube (SWNT) [25].

Careful treatment of such collisions can bring useful information about the hydrogen adsorption mechanisms in
SWNTs and quantum sieving selectivities [26]. However, in this problem particular attention should be paid not
only to the H2-H2 potential, but also to the many-body interaction between the H2 molecules and the carbon nanotube
[27-28]. The inclusion of additional complex potentials in the Schrödinger equation may substantially increase the
computational difficulty.

It is also very attractive to upgrade the four-dimensional model for linear rigid rotors used in this work to
a complete six-dimensional treatment of the H2+H2 collisions. However, because of two additional integrations
over the distances $r_1$ and $r_2$, such calculations should be very time consuming



where $v$ designates the vibrational quantum numbers $v_1$ and $v_2$ [29]. Nonetheless, the application of parallel
computing techniques together with shared-memory methodology could be a very effective computational approach,
as was partially demonstrated in this work.

Although our calculations revealed that both H2-H2 PESs used in this work provide the same type
of behaviour of the cross sections and rates, there are still significant differences. Considering the results of
these calculations, we conclude that subsequent work is needed to further improve the H2-H2 PES, and that this work
will require parallel processing if it is to be done in a timely manner.